RULE 131 DECLARATION OF MINGQIU SUN



Mail Stop Amendment 
Commissioner for Patents 
P.O. Box 1450 
Alexandria, VA 22313-1450 



I, Mingqiu Sun, hereby declare and state: 

1. I am the named inventor of the above-referenced patent
application.

2. I am advised that the United States Patent & Trademark Office
recently issued an Office action dated March 21, 2006, rejecting claims 1-11,
13, 14, 17-20, 26-37, 41-52, 54, 58, 59 and 61-64 of the above-referenced
patent application as unpatentable over Sherwood et al., Phase
Tracking and Prediction, which was published in June of 2003 in the
Proceedings of the 30th International Symposium on Computer
Architecture (ISCA) (hereinafter "Sherwood").

3. I invented the inventions claimed in the above-referenced patent
application prior to the June 2003 publication of Sherwood. For example,
Exhibit A is a copy of the invention disclosure form I prepared to relate the
inventions claimed in the above-referenced patent application to Intel's
legal department. That invention disclosure form is dated ____________,
a date prior to June of 2003. As documented in Exhibit A, I completed an
implementation and simulation corresponding to the inventions claimed in
the above-referenced patent application prior to the date on which I
prepared Exhibit A, which, as noted above, is prior to June of 2003. Thus,
I actually reduced my invention to practice prior to June of 2003, and the
Sherwood publication is not prior art to my inventions.

4. The above-referenced patent application is a continuation-in-part of
my prior U.S. Patent Application Serial No. 10/424,356. The invention
claimed in the above-referenced patent application, thus, builds on the
inventions disclosed in that parent application. Exhibit B is the invention
disclosure form I completed for that parent application. Exhibit B is dated
____________, a date prior to January 2, 2004, and provides further
documentary evidence that the inventions claimed in the above-referenced
patent application were conceived and reduced to practice prior to the
publication of Sherwood. Accordingly, Sherwood is not prior art to my
inventions.

U.S. Serial No. 10/608,324
Rule 131 Declaration 

5. I understand that willful and false statements and the like are
punishable by a fine and/or imprisonment under 18 U.S.C. § 1001, and
that such willful false statements may jeopardize the validity of this
application and any patent resulting therefrom.



Date: 




Mingqiu Sun 
Exhibit A

INTEL INVENTION DISCLOSURE

ATTORNEY-CLIENT PRIVILEGED COMMUNICATION 
located at http://legal.intel.com/patent/index.htm 



DATE: 



SOFTWARE/EPG/SSG/CSD



It is important to provide accurate and detailed information on this form. The information will be used to evaluate 
your invention for possible filing as a patent application. When completed and signed, please return this form to 
the Legal Department at JF3-147. You can submit electronically via e-mail to "invention disclosure submission"
if all of the information is electronic, including drawings and supervisor approval. If you have any questions, 
please call 264-0444. 



1. Inventor:

Last Name: Sun          First Name: Mingqiu          Middle Initial:
Phone: 503-712-6066     M/S: JF1-239                 Fax#: 503-264-4904
Citizenship: USA        WWID: 10080247               Contractor: YES ___  NO X
Inventor E-Mail Address: Minqqiu.sun@intel.com
Home Address: 15360 SW Kiwanda Lane
City: Beaverton         State: OR      Zip: 97007    Country: USA
*Corporate Level Group (e.g. IAG, ICG, NBG): EPG     Division: SSG     Subdivision: CSD
Supervisor*: Joe Daly   WWID: 10031186               Phone: 503-264-2031     M/S: JF1-239



Inventor:

Last Name:              First Name:                  Middle Initial:
Phone:                  M/S:                         Fax#:
Citizenship:            WWID:                        Contractor: YES ___  NO ___
Inventor E-Mail Address:
Home Address:
City:                   State:         Zip:          Country:
*Corporate Level Group (e.g. IAG, ICG, NBG):         Division:         Subdivision:
Supervisor*:            WWID:                        Phone:            M/S:



*If you are unsure of this information, please discuss with your manager.
(PROVIDE SAME INFORMATION AS ABOVE FOR EACH ADDITIONAL INVENTOR) 

2. Title of Invention: A Memory Prefetching Algorithm Based on Program State Prediction 

3. What technology/product/process (code name) does it relate to (be specific if you can): 

MRTE architecture and capability 

4. Include several key words to describe the technology area of the invention in addition to # 3 above: program states, program
phases, phase transition, information entropy, transaction, program trace, instruction-level instrumentation, MRTE workloads



5. Stage of development (i.e. % complete, simulations done, test chips if any, etc.): Algorithm design completed, 
implementation completed, simulation completed. 



6. (a) Has a description of your invention been, or will it shortly be, published outside Intel: 

NO: X   YES: ___   If YES, was the manuscript submitted for pre-publication approval?

IDENTIFY THE PUBLICATION AND THE DATE PUBLISHED:

(b) Has your invention been used/sold or planned to be used/sold by Intel or others?
NO: X   YES: ___   DATE WAS OR WILL BE SOLD:



REDACTED ATTORNEY-CLIENT PRIVILEGED COMMUNICATION 

(c) Does this invention relate to technology that is or will be covered by a SIG (special interest group), standard,
or specification?

NO: X YES: Name of SIG/Standard/Specification: 



(d) If the invention is embodied in a semiconductor device, actual or anticipated date of tapeout? N/A 

(e) If the invention is software, actual or anticipated date of any beta tests outside Intel N/A 



7. Was the invention conceived or constructed in collaboration with anyone other than an Intel blue badge employee
or in performance of a project involving entities other than Intel, e.g. government, other companies, universities
or consortia? NO: X   YES: ___   Name of individual or entity:



8. Is this invention related to any other invention disclosure that you have recently submitted? If so, please give the title and
inventors: Yes. "Method and Apparatus for Detecting Repetition Patterns of Program Execution States" by Mingqiu Sun




PLEASE READ AND FOLLOW THE DIRECTIONS ON 
HOW TO WRITE A DESCRIPTION OF YOUR INVENTION 

Please attach a description of the invention to this form and include the following information: 

1. Describe in detail what the components of the invention are and how the 
invention works. 

Components: 

This disclosure describes a memory prefetch methodology based on program state prediction. It 
builds on top of the program state prediction framework (see related filing). In addition to the 
sampler, signature calculator and entropy predictor, there is a prefetcher component. 

In this work, the program state prediction framework is applied to the IP counter part of memory 
reference traces. 

The basic algorithm is to associate a memory profile with each program state, and to prefetch the
profile associated with the predicted next state or states. The basic steps are:

1. Associate each state with a memory profile

a. The profile may be implemented as an array of memory references 

b. This profile is updated throughout the run 

c. The profile may consist of a subset of all memory references associated with the
states. Additional filters include recently-seen filtering, miss filtering, etc.

2. Use entropy calculation described in the framework disclosure to predict next state 

a. Each entropy contains a profile of possible next states 

b. This entropy profile is updated throughout the run 

3. Issue memory references based on the entropy profile. Several strategies, or a combination
of them, may be employed in prefetching:

a. Next state (most probable state) 

b. Next states (all probable states) 

c. None if entropy exceeds a threshold 

d. Next state for small entropy value, but all states for large entropy value 
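
The four strategies in step 3 can be sketched as a single selection function. This is a minimal illustration rather than the reference implementation: the function names, the policy labels, the natural-log entropy, the default threshold value, and the fallback to the most probable state under policy 3c when entropy is below the threshold are all assumptions made for the example.

```python
import math


def entropy(next_state_probs):
    """Entropy (natural log) of a next-state probability profile."""
    return -sum(p * math.log(p) for p in next_state_probs.values() if p > 0)


def states_to_prefetch(next_state_probs, policy, threshold=0.5):
    """Return the next states whose memory profiles would be prefetched.

    policy is one of (hypothetical labels):
      'most_probable'     -> strategy 3a
      'all_probable'      -> strategy 3b
      'none_if_uncertain' -> strategy 3c (assumed fallback: most probable state)
      'adaptive'          -> strategy 3d
    """
    if not next_state_probs:
        return []
    # Rank candidate next states from most to least probable.
    ranked = sorted(next_state_probs, key=next_state_probs.get, reverse=True)
    h = entropy(next_state_probs)
    if policy == "most_probable":
        return ranked[:1]
    if policy == "all_probable":
        return ranked
    if policy == "none_if_uncertain":
        return [] if h > threshold else ranked[:1]
    if policy == "adaptive":
        return ranked[:1] if h <= threshold else ranked
    raise ValueError(f"unknown policy: {policy}")
```

For a profile such as `{2: 0.3, 3: 0.7}`, the first policy selects state 3 only, while the second selects both states in order of probability.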

The following diagram shows a prefetching scenario. Initially, state 1 is selected by the program,
out of possibilities of 1, 6, 11, 18 and 21. This corresponds to the start of a transaction, due to the
high uncertainty associated with the previous state. Based on the entropy profile, we know the
next states are 2 and 3, with 3 more likely. Hence, we issue a prefetch for the memory profile
associated with state 3. This demonstrates intra-transactional variance. Next, the transitions to
states 4 and 5 are 100% deterministic, which makes prefetching highly effective. However, state
5 has a high entropy value, which signals an end to the current transaction.
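
The scenario above can be traced numerically. The sketch below is illustrative only: the disclosure gives the qualitative shape of the transitions, so the specific probabilities in `TRANSITIONS` (0.3/0.7 out of state 1, a uniform split out of state 5 over the set 1, 6, 11, 18 and 21) and the helper names are hypothetical.

```python
import math

# Hypothetical transition probabilities; only their qualitative shape
# (3 more likely than 2, deterministic steps to 4 and 5, high uncertainty
# after state 5) comes from the scenario text.
TRANSITIONS = {
    1: {2: 0.3, 3: 0.7},
    3: {4: 1.0},
    4: {5: 1.0},
    5: {1: 0.2, 6: 0.2, 11: 0.2, 18: 0.2, 21: 0.2},
}


def entropy(probs):
    """Transitional entropy (natural log) of one state's next-state profile."""
    return sum(-p * math.log(p) for p in probs.values() if p > 0)


def walk(start, steps):
    """Follow the most probable transition, recording (state, entropy) pairs."""
    path, state = [], start
    for _ in range(steps):
        probs = TRANSITIONS[state]
        path.append((state, entropy(probs)))
        state = max(probs, key=probs.get)
    return path
```

Calling `walk(1, 4)` visits states 1, 3, 4 and 5. Under these assumed numbers the entropy is zero at states 3 and 4 and largest at state 5, which is the spike that would mark the end of the transaction.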

Does it work?

In simulation of a Northwood memory model, this algorithm improves the L2 cache miss ratio
by 20% for SPECjbb and by 15% for ECperf:

[Figure: Phase Guided Memory Prefetch Simulation (x-axis: memory access)]



2. Describe advantage(s) of your invention over what is done now. 

Our approach is similar to what is described in the literature as a Markov prefetcher. However, the
similarity is limited to predicting the next state only (see the framework disclosure for an
explanation). The biggest difference is our association of a profile of memory references with a
program state. This allows effective prefetching of clustered memory references. Since our state
depth is typically in the thousands of instructions, it gives an adequate time window for prefetching to
work.
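
The association of a memory profile with each program state might look like the following. This is a sketch under stated assumptions, not the reference implementation: the class names `MemoryProfile` and `StateProfiles`, the `capacity` bound, and the `was_miss` flag are hypothetical, and the two filters are simple stand-ins for the recently-seen and miss filtering mentioned in step 1c.

```python
from collections import OrderedDict


class MemoryProfile:
    """Bounded, most-recently-seen set of addresses for one program state."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self._addrs = OrderedDict()

    def record(self, addr, was_miss=True):
        # Miss filtering: keep only references that missed in the cache.
        if not was_miss:
            return
        # Recently-seen filtering: refresh an address instead of duplicating it.
        self._addrs.pop(addr, None)
        self._addrs[addr] = True
        if len(self._addrs) > self.capacity:
            self._addrs.popitem(last=False)  # evict the oldest entry

    def addresses(self):
        return list(self._addrs)


class StateProfiles:
    """Maps each program state to its memory profile."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.by_state = {}

    def record(self, state, addr, was_miss=True):
        if state not in self.by_state:
            self.by_state[state] = MemoryProfile(self.capacity)
        self.by_state[state].record(addr, was_miss)

    def prefetch_list(self, predicted_states):
        """Addresses to prefetch for the predicted next state or states."""
        out = []
        for s in predicted_states:
            if s in self.by_state:
                out.extend(self.by_state[s].addresses())
        return out
```

Because the profile is keyed by state rather than by a single miss address, one prediction yields a whole cluster of addresses to prefetch.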

Another advantage of our approach is in the entropy calculation that is a systematic measure of 
transitional uncertainty. Besides the insight it provides as a transactional demarcation mark, it 
enables additional prefetching policies such as 3c and 3d listed in section 1. 

3. YOU MUST include at least one figure illustrating the invention. 

If the invention relates to software, include a flowchart 
or pseudo-code representation of the algorithm. 

The following diagram shows a high level flow. The function of each of the four components has
been described in the preceding section already.

The following diagram demonstrates the main data structures and relevant parameters in our
reference software implementation. As one can see, the most expensive part is a memory
array for storing the memory profiles associated with every state.




4. Value of your invention to Intel (how will it be used?). 

The invention could be used in both software and hardware. For example, a static compiler could 
exploit the predictable repetitive behavior by generating speculative threads to prefetch memory 
references associated with the next program states; it may also be used in an MRTE JIT engine to
prefetch memory references based on dynamic profiling. The optimization technique may also be 
implemented directly inside a microprocessor to achieve similar performance optimization goals. 

5. Explain how your invention is novel. If the technology itself is not new,
explain what makes it different.

This invention is novel in leveraging our new program state prediction framework to provide 
memory performance benefit. 

6. Identify the closest or most pertinent prior art that you are aware of. 

Markov prefetcher paper: http://systems.cs.colorado.edu/Papers/Architecture/ISCA97-MarkovPrefetch/

Hot data stream prefetcher, which uses Sequitur compression to match memory access patterns
and issue prefetches based on seen patterns:
http://research.microsoft.com/~trishulc/papers/prefetch hds.pdf

7. Who is likely to want to use this invention or infringe the patent
if one is obtained and how would infringement be detected?

Compiler writers, MRTE virtual machine writers and CPU designers are potential users of this invention.
There should be a noticeable performance improvement if it is implemented correctly. A careful
performance comparison analysis should provide evidence and help infringement detection.

HAVE YOUR SUPERVISOR READ, DATE AND SIGN COMPLETED FORM 
OR FORWARD IT ELECTRONICALLY VIA E-MAIL TO "INVENTION DISCLOSURE SUBMISSION" 



DATE: SUPERVISOR: 



BY THIS SIGNING, I (SUPERVISOR) ACKNOWLEDGE THAT I HAVE READ AND UNDERSTAND THIS
DISCLOSURE, AND RECOMMEND THAT THE HONORARIUM BE PAID

Exhibit B

INTEL INVENTION DISCLOSURE

ATTORNEY-CLIENT PRIVILEGED COMMUNICATION
located at http://legal.intel.com/patent/index.htm

It is important to provide accurate and detailed information on this form. The information will be used to evaluate
your invention for possible filing as a patent application. When completed and signed, please return this form to
the Legal Department at JF3-147. You can submit electronically via e-mail to "invention disclosure submission"
if all of the information is electronic, including drawings and supervisor approval. If you have any questions,
please call 264-0444.



1. Inventor:

Last Name: Sun          First Name: Mingqiu          Middle Initial:
Phone: 503-712-6066     M/S: JF1-239                 Fax#: 503-264-4904
Citizenship: USA        WWID: 10080247               Contractor: YES ___  NO X
Inventor E-Mail Address: Minqqiu.sun@intel.com
Home Address: 15360 SW Kiwanda Lane
City: Beaverton         State: OR      Zip: 97007    Country: USA
*Corporate Level Group (e.g. IAG, ICG, NBG): EPG     Division: SSG     Subdivision: CSD
Supervisor*: Joe Daly   WWID: 10031186               Phone: 503-264-2031     M/S: JF1-239



Inventor:

Last Name:              First Name:                  Middle Initial:
Phone:                  M/S:                         Fax#:
Citizenship:            WWID:                        Contractor: YES ___  NO ___
Inventor E-Mail Address:
Home Address:
City:                   State:         Zip:          Country:
*Corporate Level Group (e.g. IAG, ICG, NBG):         Division:         Subdivision:
Supervisor*:            WWID:                        Phone:            M/S:



*If you are unsure of this information, please discuss with your manager.
(PROVIDE SAME INFORMATION AS ABOVE FOR EACH ADDITIONAL INVENTOR) 

2. Title of Invention: Method and Apparatus for Detecting Repetition Patterns of Program Execution States 

3. What technology/product/process (code name) does it relate to (be specific if you can): 

MRTE architecture and capability development 

4. Include several key words to describe the technology area of the invention in addition to # 3 above: program states, program
phases, phase transition, information entropy, transaction, program trace, instruction-level instrumentation, MRTE workloads



5. Stage of development (i.e. % complete, simulations done, test chips if any, etc.): Algorithm design completed, 
implementation completed, simulation completed. 



6. (a) Has a description of your invention been, or will it shortly be, published outside Intel: 

NO: X   YES: ___   If YES, was the manuscript submitted for pre-publication approval?

IDENTIFY THE PUBLICATION AND THE DATE PUBLISHED:

(b) Has your invention been used/sold or planned to be used/sold by Intel or others?
NO: X   YES: ___   DATE WAS OR WILL BE SOLD:



REDACTED ATTORNEY-CLIENT PRIVILEGED COMMUNICATION

(c) Does this invention relate to technology that is or will be covered by a SIG (special interest group), standard,
or specification?

NO: X YES: Name of SIG/Standard/Specification: 

(d) If the invention is embodied in a semiconductor device, actual or anticipated date of tapeout? N/A 

(e) If the invention is software, actual or anticipated date of any beta tests outside Intel N/A 

7. Was the invention conceived or constructed in collaboration with anyone other than an Intel blue badge employee
or in performance of a project involving entities other than Intel, e.g. government, other companies, universities
or consortia? NO: X   YES: ___   Name of individual or entity:

8. Is this invention related to any other invention disclosure that you have recently submitted? If so, please give the title and 
inventors: No 



PLEASE READ AND FOLLOW THE DIRECTIONS ON 
HOW TO WRITE A DESCRIPTION OF YOUR INVENTION 

Please attach a description of the invention to this form and include the following information: 

1. Describe in detail what the components of the invention are and how the 
invention works. 

Problem statement:

♦ There are repetitive behaviors and patterns at the transaction level in transaction-oriented
workloads (SPECjbb, ECperf, ...).
   ❖ The repetitive behavior can be described quantitatively.
   ❖ The repetitive behavior can be exploited for performance benefit.
   ❖ The basic unit of such repetition is a transaction/sub-transaction, or one pathlength
   measured by instructions.
♦ Scales of different repetition patterns
   ❖ Loops: repetition of instructions
   ❖ Phases: repetition of transactions (focus of this disclosure)
   ❖ Runs: repetition of programs

Basic framework definitions:

❖ A program state is defined as a collection of information about the instructions executed in a
given time window.

❖ A signature is an approximation of a program state, obtained by mapping the program counters
(PC) and/or memory addresses of executed instructions into a fixed-length set of bits.

❖ A state transition happens when the change in the signature bit pattern exceeds a certain
threshold.

❖ Entropy: the informational entropy associated with a program state describes the transitional
uncertainty to the next program state.

❖ A phase boundary is detected by a spike in transitional entropy.

❖ Significant intra-transaction variance may give rise to sub-transactions.

Components: 

The three major components of this invention are the trace sampler, the signature calculator and the
entropy predictor.

♦ Trace sampler

The function of this component is to feed the signature calculator with the raw program execution
trace. There are a couple of mechanisms for sampling the program execution trace, involving
either hardware counters or software instrumentation. Examples are CPU performance counters
and MRTE instrumentation. Typical traces that are useful for determining program execution states
are sequences of basic blocks (PC) and memory operations (PC, Addr).

Although the trace sampler is a necessary component for this technology to work, the actual
techniques for using CPU performance counters or software instrumentation are not covered in this disclosure.

♦ Signature calculator

The function of this component is to process the trace and calculate the signature that corresponds to
the current program state. This is an improved version over Dhodapkar & Smith. In example
implementations, we use a sliding window to map the basic block trace (PC), and/or the PCs and memory
addresses associated with memory operations, to an n-bit vector signature with a random
hashing function. The mapping may be made indeterministic with an exponential decay function.
The following diagram demonstrates the process:



Here b is the basic unit or resolution in our model. The signature difference is calculated as:

$$\Delta = \frac{|S_1 \ \mathrm{XOR}\ S_2|}{|S_1 \ \mathrm{OR}\ S_2|}$$

A threshold can be established to delineate program states. A transition to a new state happens 
when a signature difference exceeds the threshold. 
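
The signature and difference calculation can be sketched as follows. This is an illustration under assumptions, not the reference implementation: `zlib.crc32` stands in for the unspecified random hashing function, the 64-bit vector width and the 0.5 threshold are arbitrary, and |S| is taken to mean the number of set bits in S, which the disclosure does not state explicitly.

```python
import zlib


def signature(pcs, n_bits=64):
    """Hash each program counter in the window to one bit of an n-bit vector."""
    sig = 0
    for pc in pcs:
        # zlib.crc32 is an assumed stand-in for the random hashing function.
        bit = zlib.crc32(pc.to_bytes(8, "little")) % n_bits
        sig |= 1 << bit
    return sig


def signature_difference(s1, s2):
    """|S1 XOR S2| / |S1 OR S2|, with |.| assumed to count set bits."""
    union = bin(s1 | s2).count("1")
    if union == 0:
        return 0.0  # two empty signatures are treated as identical
    return bin(s1 ^ s2).count("1") / union


def is_new_state(s1, s2, threshold=0.5):
    """A transition to a new state happens when the difference exceeds the threshold."""
    return signature_difference(s1, s2) > threshold
```

Two windows that touch the same basic blocks give a difference of 0, and two windows with no bits in common give a difference of 1, so the threshold sits between those extremes.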

♦ Entropy predictor 

Entropy predictor is a component that compiles transitional entropy profiles of each program 
execution state based on past history, and makes a prediction for the next program state based 
on the state's transition probability distribution. This entropy value may also be used as a 
demarcation mark between macroscopic transactions, due to the maximum transitional 
uncertainty at the last state of a transaction. 

The following formula defines informational entropy:

$$H = -K \sum_i P_i \log P_i,$$

where $P_i$ is the transitional probability to state $i$ from the current state, and $K$ is a constant. The
probability is calculated via exponential moving averaging throughout the run.
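
A minimal sketch of an entropy predictor along these lines is shown below. The class name, the smoothing factor `alpha`, the choice of the natural logarithm, and the exact form of the moving-average update are assumptions; the disclosure states only that probabilities are maintained by exponential moving averaging.

```python
import math
from collections import defaultdict


class EntropyPredictor:
    def __init__(self, alpha=0.1, k=1.0):
        self.alpha = alpha  # smoothing factor for the moving average (assumed)
        self.k = k          # the constant K in the entropy formula
        self.probs = defaultdict(dict)  # state -> {next_state: probability}

    def observe(self, state, next_state):
        """Update the transition profile of `state` after seeing `next_state`."""
        row = self.probs[state]
        if not row:
            row[next_state] = 1.0
            return
        row.setdefault(next_state, 0.0)
        # Move every probability toward 1 (observed state) or 0 (all others);
        # this keeps the row summing to 1.
        for s in row:
            target = 1.0 if s == next_state else 0.0
            row[s] += self.alpha * (target - row[s])

    def entropy(self, state):
        """H = -K * sum(P_i * log P_i) over the next-state profile."""
        row = self.probs[state]
        return -self.k * sum(p * math.log(p) for p in row.values() if p > 0)

    def predict(self, state, top=1):
        """Return the `top` most probable next states."""
        row = self.probs[state]
        return sorted(row, key=row.get, reverse=True)[:top]
```

A state with a single observed successor has zero entropy, while a state whose successors are spread over several candidates has higher entropy, which is the signal used to mark the end of a transaction.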

Does it work? 

This methodology works very well with good accuracy in predicting the program states of MRTE 
workloads. The following table shows the prediction accuracy for SPECjbb and ECperf. "Top 1 
state" means using the top state in the entropy profile to predict. "Top 3 states" means using all 
the top 3 states in the entropy profile to predict. 

Prediction accuracy:

Workload    Top 1 state    Top 3 states
SPECjbb     80%            94%
ECperf      52.4%          73.5%
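
The two accuracy measures in the table can be computed as in the sketch below. It is illustrative only: the function name and the input layout (a ranked list of predicted states per step, paired with the state that actually followed) are assumptions, and no data from the reported simulations is reproduced here.

```python
def top_k_accuracy(ranked_predictions, actual_states, k):
    """Fraction of steps where the actual next state is among the top-k predictions.

    ranked_predictions: one list per step, most probable state first.
    actual_states: the state that actually followed at each step.
    """
    if not actual_states:
        return 0.0
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, actual_states)
        if truth in ranked[:k]
    )
    return hits / len(actual_states)
```

With `k=1` this corresponds to the "Top 1 state" measure, and with `k=3` to the "Top 3 states" measure.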

The following diagram demonstrates program states and their associated entropy values. The y-axis
denotes states, and the x-axis denotes memory accesses, which are a proxy for time. Each spike in
entropy corresponds to the ending state of a transaction or sub-transaction.

2. Describe advantage(s) of your invention over what is done now.

Predicting program execution state is a known difficult problem in the industry. Although there
are attempts to exploit repetitive structures such as loops to prefetch, those methodologies are
largely limited to highly regular and simple workloads such as scientific codes. Effectively
predicting program states for general-purpose workloads such as SPECjbb and ECperf remains
an open problem.

Our signature calculator is an improved version of Dhodapkar & Smith, in that we use a sliding
window, instead of a pre-determined fixed window, to process the trace, to better adapt to changing
program behavior and to detect repetition patterns at arbitrary time scales. We also introduced
exponential decay in the mapping to an n-bit vector, which proved very helpful in reducing the
need for a large vector. This is especially useful for defining memory address states,
where the memory address space is constantly expanding during execution.
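
The disclosure does not spell out how the exponential decay is applied. One plausible reading, sketched below purely as an assumption, is that each bit carries a weight that decays on every update and is refreshed when a program counter hashes to it, so that stale bits drop out of the signature and a smaller vector suffices. The class name, the `decay` and `cutoff` parameters, and the use of `zlib.crc32` as the hash are all hypothetical.

```python
import zlib


class DecayingSignature:
    """Sliding-window signature whose bits fade unless they are refreshed."""

    def __init__(self, n_bits=32, decay=0.9, cutoff=0.5):
        self.n_bits = n_bits
        self.decay = decay    # per-update multiplicative decay (assumed)
        self.cutoff = cutoff  # a bit counts as set while its weight >= cutoff
        self.weights = [0.0] * n_bits

    def _index(self, pc):
        # zlib.crc32 is an assumed stand-in for the random hashing function.
        return zlib.crc32(pc.to_bytes(8, "little")) % self.n_bits

    def update(self, pc):
        # Age every bit, then refresh the bit this PC hashes to.
        self.weights = [w * self.decay for w in self.weights]
        self.weights[self._index(pc)] = 1.0

    def bits(self):
        """Current signature as an integer bit vector."""
        return sum(1 << i for i, w in enumerate(self.weights) if w >= self.cutoff)
```

Under this reading, the effective window length is set by `decay` and `cutoff` rather than by a fixed trace length.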

A Markov predictor has a probability distribution model based on past values, which is similar to our
approach. For example, Joseph and Grunwald proposed a Markov prefetcher based on cache
miss events. However, its finite state model based on cache misses is not proven to hold for all
workloads. Also, a state based on a single cache miss is usually too short to be useful for
prefetching. In contrast, our program state definition based on signature is on the order of 10^3
instructions, which provides an adequate time window for prediction and prefetch to be effective.

In addition to individual component level improvements, we provide a better combination of
program state definition and a predictor based on a well-defined finite state machine, as
demonstrated by the high prediction accuracy with SPECjbb and ECperf.

Finally, we provide a self-consistent theoretical framework that did not exist before. The notion
of program states defined by signature, and program phases determined by spikes in entropy,
allows a natural mapping from phases, which are trace-based microscopic constructs, to business
transactions, which are macroscopic programming constructs. This characterization provides a
demarcation mechanism for macroscopic transactions for the first time.



3. YOU MUST include at least one figure illustrating the invention. 

If the invention relates to software, include a flowchart 
or pseudo-code representation of the algorithm. 

The following diagram shows a high level flow. The function of each of the three components has 
been described in the preceding section already. 




The following diagram demonstrates main data structures and relevant parameters in our 
reference software implementation. 

4. Value of your invention to Intel (how will it be used?).



The invention could be used in both software and hardware. For example, a static compiler could 
exploit the predictable repetitive behavior by generating speculative threads to prefetch memory 
references associated with the next program states; it may also be used in an MRTE JIT engine to
prefetch memory references based on dynamic profiling. The optimization technique may also be 
implemented directly inside a microprocessor to achieve similar performance optimization goals. 

5. Explain how your invention is novel. If the technology itself is not new, 
explain what makes it different. 

The invention is novel in that it solves a known difficult problem in the industry, through 
establishment of a theoretical framework with definitions of states, state transitions, phases, and 
phase transitions, and an entropy-based program state prediction algorithm with good accuracy. 
We also established a connection between microscopic phases in trace analysis and 
macroscopic transactions in business workloads. 

Although we use MRTE workloads as examples in this work, we believe the technology is 
applicable to a wide variety of server-oriented workloads. 



6. Identify the closest or most pertinent prior art that you are aware of.

Dhodapkar & Smith paper on signature calculation:
http://www.cae.wisc.edu/~dhodapka/isca02.pdf

Eric Sprangle at Intel DPG has also done work on signature calculation based on the Dhodapkar
& Smith methodology.

Markov prefetcher paper: http://systems.cs.colorado.edu/Papers/Architecture/ISCA97-MarkovPrefetch/

7. Who is likely to want to use this invention or infringe the patent 
if one is obtained and how would infringement be detected? 

Compiler writers, MRTE virtual machine writers and CPU designers are potential users of this invention.
There should be noticeable performance improvement if implemented correctly. A careful 
performance comparison analysis should provide evidence and help infringement detection. 

HAVE YOUR SUPERVISOR READ, DATE AND SIGN COMPLETED FORM 
OR FORWARD IT ELECTRONICALLY VIA E-MAIL TO "INVENTION DISCLOSURE SUBMISSION" 



DATE: SUPERVISOR: 




BY THIS SIGNING, I (SUPERVISOR) ACKNOWLEDGE THAT I HAVE READ AND UNDERSTAND THIS 
DISCLOSURE, AND RECOMMEND THAT THE HONORARIUM BE PAID 